Data Visualization Assignment 3
In this report we analyze the overall progress of the LEGO
company. Specifically, we examine how its marketing strategy has changed
over the years, the technological advancement of its production line, and
the creativity involved in creating and expanding the themes of LEGO
sets. Our in-depth analysis aims to provide enough information to
anticipate the direction in which the LEGO company is heading.
Our first few experiments suggest that LEGO has changed drastically in
terms of marketing. The data shows that LEGO is slowly starting to
attract a more adult audience, and this trend will most likely continue,
as such a strategy is likely more profitable for the company. As this
strategy works well, LEGO bricks have been in demand. This implies that
brick production technology is also sufficiently advanced, as far more
bricks are produced, in a large variety of colors, which is enough to
meet that demand. This advancement has been so rapid in the last few
years that we expect it to slow down and eventually level off.
Additionally, visualizing the hierarchy of LEGO themes allowed us to
notice some interesting dependencies. Above all, we noticed that most
modern themes do not have subthemes yet. We believe this is because LEGO
prefers to play it safe: it first creates multiple separate themes and
then delves deeper into only the few that have met public expectations.
This approach is unlikely to change, as it appears to have been in place
from the beginning.
We recommend more focused research into the exact themes
that have been expanded upon, as this information may prove useful
for creating successful future sets. We also recommend
analyzing themes that have not succeeded, to avoid similar mistakes.
This report was produced as an assignment for the Data Visualization course. We analyze the overall progress of the LEGO company. Specifically, we examine how its marketing strategy has changed over the years, the technological advancement of its production line, and the creativity involved in creating and expanding the themes of LEGO sets. It is well known that LEGO has grown since its beginning, but we perform an in-depth analysis of the topics mentioned above to measure exactly what that growth has been. Additionally, we try to draw conclusions about the decisions of the company's executives based on our observations. We also attempt to predict the company's future growth and the direction it may take. Finally, we present possible areas for further research.
To perform the following experiments we used the data set published by Rebrickable. The schema diagram for the LEGO data files is shown below (picture from the Rebrickable website).
Additionally, to perform all necessary calculations and create the graphs we used R and the following libraries:
Data files have to be placed in a separate directory called Data, which should be created in the same directory as the source file.
Keep in mind that this report was created mainly to be viewed as HTML, so PDF output may not work.
This graph shows the number of sets according to the number of parts
they contain. Each bar on this plot represents the interval
from \(100 * n\) to \(100 * (n + 1)\) parts. The plot is limited
to sets with fewer than 1500 parts, as sets above
this number are sparse and add little useful information.
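The binning described above can be sketched as follows. This is only an illustrative Python sketch, not the R code used for the report; the function name `bin_part_counts`, the half-open interval convention and the sample values are assumptions.

```python
from collections import Counter


def bin_part_counts(num_parts, bin_width=100, upper_limit=1500):
    """Count sets per bin n, where bin n covers [bin_width*n, bin_width*(n+1)).

    Sets with upper_limit parts or more are dropped, mirroring the
    cut-off applied to the plot. The half-open interval is an assumption.
    """
    counts = Counter()
    for parts in num_parts:
        if 0 <= parts < upper_limit:
            counts[parts // bin_width] += 1
    return dict(sorted(counts.items()))


# Made-up part counts, not taken from the data set.
example_parts = [12, 99, 100, 250, 1499, 1500, 3000]
histogram = bin_part_counts(example_parts)
# histogram == {0: 2, 1: 1, 2: 1, 14: 1}
```

The two largest sample sets fall outside the 1500-part limit and are therefore not counted.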
This visualization shows that most sets do not exceed 200 parts. We can
also roughly divide set sizes into three groups according to how often
they are produced: the most common group, up to 200 parts; a less common
group, up to 1000 parts; and the least common group, above 1000 parts.
The relations between these groups are examined in the following
experiments.
This graph shows the number of different sets produced each year
and their classification into the three groups mentioned previously. Each
bar on this plot represents a single year of LEGO's existence. The bar is
divided into colors, which can be interpreted using the
legend.
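The per-year classification can be sketched as below. This is an illustrative Python sketch rather than the report's R code. The group names and the exact boundary handling (small below 200 parts, medium from 200 to 1000, large above 1000) are assumptions, since the text does not say which group a set with exactly 200 or 1000 parts belongs to.

```python
from collections import defaultdict


def size_group(num_parts):
    """Assign a set to one of the three size groups (boundaries assumed)."""
    if num_parts < 200:
        return "small"
    if num_parts <= 1000:
        return "medium"
    return "large"


def count_groups_per_year(sets):
    """sets: iterable of (year, num_parts) pairs; returns counts per year."""
    counts = defaultdict(lambda: {"small": 0, "medium": 0, "large": 0})
    for year, num_parts in sets:
        counts[year][size_group(num_parts)] += 1
    return {year: counts[year] for year in sorted(counts)}


# Made-up records, not taken from the data set.
example_sets = [(1995, 50), (1995, 450), (2021, 120), (2021, 1000), (2021, 2300)]
per_year = count_groups_per_year(example_sets)
# per_year[2021] == {"small": 1, "medium": 1, "large": 1}
```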
This visualization shows that the number of sets grew over the
years, reaching its peak in 2021. It also shows once again
that sets with more than 1000 parts are the rarest, while sets
with fewer than 200 parts are the most common. This graph, however, does
not allow us to easily examine the exact proportions of these three
groups in a given year. This knowledge is fundamental to
understanding the main target market of the LEGO company. The next
experiment sheds some light on this topic.
This graph shows the proportion of the three set-size groups
throughout the years. Each bar on this plot represents a single year of
LEGO's existence and is divided into colors, which can be
interpreted using the legend. There are two years for which no
set information is available.
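Turning yearly counts into proportions is a simple normalization, sketched here in Python for illustration (the report itself used R). The function name, the group labels and the sample counts are assumptions; years with no sets are skipped, which matches the gaps mentioned above.

```python
def group_proportions(counts_per_year):
    """Convert per-year group counts into shares that sum to 1 for each year."""
    proportions = {}
    for year, groups in counts_per_year.items():
        total = sum(groups.values())
        if total == 0:
            continue  # a year without sets leaves a gap in the plot
        proportions[year] = {name: count / total for name, count in groups.items()}
    return proportions


# Made-up counts, not taken from the data set.
example_counts = {
    1965: {"small": 9, "medium": 1, "large": 0},
    1970: {"small": 0, "medium": 0, "large": 0},
    2021: {"small": 6, "medium": 3, "large": 1},
}
shares = group_proportions(example_counts)
```

Here `shares` contains entries for 1965 and 2021 only, because the sample year 1970 has no sets.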
This plot shows that small sets (\(<200\)) were the most common every year,
but some further conclusions can be drawn. First of all, LEGO
began producing medium sets (\(200<x<1000\)) in the 1960s, and from
that point onward the proportion of these sets mostly grew.
Moreover, LEGO started producing sets with more than 1000 parts
around the 1990s, and the proportion of these sets has been growing ever since.
This suggests that LEGO is widening the age range of its potential
customers by producing sets targeted at older audiences,
while still maintaining the production of sets created specifically for
children and teenagers.
This graph shows the number of different bricks used in sets
produced in a given year and the distribution of their colors. Each bar
on this plot represents a single year of LEGO's existence and is divided
into colors that match the actual colors of the bricks
produced. Transparency and other special features of bricks were
omitted.
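One way to obtain such per-year counts is to follow the links from sets to inventories to inventory parts. The Python sketch below is illustrative only and is not the report's R code. It assumes that inventories reference sets and that inventory parts carry a part number and a color id, and it counts distinct part numbers per color, which is one possible reading of "different bricks".

```python
from collections import defaultdict


def bricks_per_year_by_color(set_year, inventory_set, inventory_parts):
    """Count distinct part numbers per (year, color).

    set_year:        dict mapping set number -> release year
    inventory_set:   dict mapping inventory id -> set number
    inventory_parts: iterable of (inventory_id, part_num, color_id)
    """
    distinct = defaultdict(lambda: defaultdict(set))
    for inventory_id, part_num, color_id in inventory_parts:
        year = set_year.get(inventory_set.get(inventory_id))
        if year is None:
            continue  # inventory does not belong to a known set
        distinct[year][color_id].add(part_num)
    return {
        year: {color: len(parts) for color, parts in sorted(colors.items())}
        for year, colors in sorted(distinct.items())
    }


# Made-up identifiers, not taken from the data set.
set_year = {"A-1": 1980, "B-1": 1980, "C-1": 2020}
inventory_set = {1: "A-1", 2: "B-1", 3: "C-1", 4: "X-1"}
inventory_parts = [
    (1, "3001", 4), (2, "3001", 4), (2, "3002", 4),
    (3, "3001", 15), (3, "3001", 4), (4, "3001", 4),
]
color_counts = bricks_per_year_by_color(set_year, inventory_set, inventory_parts)
# color_counts == {1980: {4: 2}, 2020: {4: 1, 15: 1}}
```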
Having analyzed the marketing strategy of LEGO, it is also important to
analyze the technological growth of the production line of the bricks
themselves. This plot shows that the variety of bricks has been growing
almost continuously, and the colors now in use cover nearly all
hues that the human eye can distinguish, which was not the case in
the beginning. This suggests that the company has developed sufficient
technology to produce a large variety of bricks and colors on a
large scale, making it possible to reach larger audiences.
This graph shows the number of themes used per year and whether
the themes have a parent or not. Each bar on this plot represents a
single year of LEGO's existence and is divided into two subcategories
that can be interpreted using the legend. There is also a zoomed view of
the early days of LEGO, to show more clearly how the themes
were distributed back then.
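The split into themes with and without a parent can be sketched as follows. This Python sketch is illustrative and is not the report's R code. It assumes each set record carries a year and a theme id, and that each theme has an optional parent id; the function and key names are hypothetical.

```python
def themes_per_year(sets, theme_parent):
    """Count distinct themes per year, split by whether they have a parent.

    sets:         iterable of (year, theme_id) pairs
    theme_parent: dict mapping theme id -> parent theme id, or None
    """
    themes_by_year = {}
    for year, theme_id in sets:
        themes_by_year.setdefault(year, set()).add(theme_id)

    result = {}
    for year in sorted(themes_by_year):
        themes = themes_by_year[year]
        with_parent = sum(1 for t in themes if theme_parent.get(t) is not None)
        result[year] = {
            "with_parent": with_parent,
            "standalone": len(themes) - with_parent,
        }
    return result


# Made-up identifiers, not taken from the data set.
example_parents = {1: None, 2: 1, 3: None}
example_theme_sets = [(1960, 2), (1960, 2), (2020, 1), (2020, 3), (2020, 2)]
per_year_themes = themes_per_year(example_theme_sets, example_parents)
# per_year_themes[2020] == {"with_parent": 1, "standalone": 2}
```

Each theme is counted once per year regardless of how many sets it contains in that year.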
Having seen the advancement of both technology and marketing, we can
now look at how the number of themes varied over the years. This
graph shows that the diversity of themes has been growing throughout the
whole lifetime of LEGO. We can also see that more
and more standalone themes are produced, while previously released
themes continue to be expanded. Interestingly, the themes
produced at the very beginning had parent themes, which may imply
several possibilities: the data was not correctly classified,
the data does not contain all the information, or, most likely, themes
in which sets were produced in the early days were later reclassified
as subthemes of a newer theme. We explore the theme
hierarchy in the next experiment to draw more accurate and in-depth
conclusions.
This graph visualizes the expansion of themes and the year in which
the first set of each theme was released. Each circle represents a
theme and may contain other circles, which represent its
subthemes. The color of the rim of each circle represents the
year in which the first set of that theme was created. The year can be
read from the legend on the side. Names are not included, as the
graph would otherwise be unreadable.
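The data behind such a nested-circle view is a tree built from parent links plus the first release year of each theme. The Python sketch below is illustrative and is not the report's R code. All names are hypothetical, and it assumes the parent links contain no cycles.

```python
def build_children(theme_parent):
    """Map each parent id (None for top-level themes) to its child theme ids."""
    children = {}
    for theme_id, parent_id in theme_parent.items():
        children.setdefault(parent_id, []).append(theme_id)
    return children


def depth(theme_id, theme_parent):
    """Nesting level of a theme: 0 for top-level, 1 for a subtheme, and so on."""
    level = 0
    while theme_parent.get(theme_id) is not None:
        theme_id = theme_parent[theme_id]
        level += 1
    return level


def first_year(sets):
    """Earliest release year per theme, from (year, theme_id) pairs."""
    first = {}
    for year, theme_id in sets:
        first[theme_id] = min(year, first.get(theme_id, year))
    return first


def older_than_parent(theme_parent, first):
    """Themes whose first set predates the first set of their parent theme."""
    return sorted(
        t for t, p in theme_parent.items()
        if p is not None and t in first and p in first and first[t] < first[p]
    )


# Made-up identifiers and years, not taken from the data set.
example_hierarchy = {10: None, 11: 10, 12: 11, 20: None}
example_releases = [(1999, 10), (1978, 11), (2005, 12), (2015, 20), (2001, 11)]
children = build_children(example_hierarchy)
first = first_year(example_releases)
older = older_than_parent(example_hierarchy, first)
# older == [11]
```

In the sample data, theme 11 first appears in 1978 while its parent first appears in 1999, so it is reported as a subtheme that is older than its parent.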
As mentioned in the previous experiment, some of the information
presented on that graph was unclear. This representation shows that the most
likely explanation is indeed correct: some older themes
are categorized as subthemes of newer ones. We also notice that
the majority of themes were not expanded upon. This suggests that consumers
of LEGO products get bored quickly and prefer a large variety of
themes over a deeper exploration of a single theme. There are a few
exceptions with multiple levels of hierarchy, suggesting that some
specific topics are liked by the public and lead LEGO to keep
creating ever more specific subthemes. On top of that, we
can see that it is usually older themes that are expanded, while most of the
newer themes are either subthemes of older themes or completely new
themes that have yet to be expanded. Interestingly, there is a single
group that was already expanded in the early days of LEGO and has
not received any further expansions. To show the details of any
interesting cases, we have prepared an additional,
interactive representation, which remains readable even with
theme names included.
This graph shows the same information as Graph 6, with the addition of theme names, so that research can be conducted on specific themes. To see deeper levels of the hierarchy, click on a subcategory. To leave the zoom, click outside of the circle. Some information about themes that have not been expanded yet is still hard to read, but clicking precisely on such a theme reveals its title.
The accuracy and significance of our experiments depend solely on the data set provided. Our data set possibly had some missing information, but the most significant data was present. We are reassured of its accuracy by the fact that the graphs agree closely with current and previously observed trends. Our first few experiments try to evaluate what one of those trends may be, and they suggest that LEGO has changed drastically in terms of marketing. The data represented in the graphs shows that LEGO has changed its strategy and is slowly starting to attract a more adult audience. This trend will most likely continue, as such a strategy is likely more profitable for the company. We can also see that far more sets are being released now, which suggests that the marketing strategy is working. This also reflects the technological advancement of brick production: demand is growing, yet LEGO is able not only to produce enough bricks but also to come up with new shapes and colors almost every year. However, we believe that this growth has nearly stopped and will become almost constant in the near future. Given how quickly the number of different bricks grew in the past few years, it is bound to stop, as human creativity has its limits. Moreover, in 2022 we can already see a slight drop in the number of bricks.
Furthermore, creativity can also be examined through the hierarchy of themes. That representation allowed us to notice some interesting dependencies. For example, LEGO does not forget about its older themes: it either marks old themes as subthemes of newer ones, though that is rare, or extends the old ones with new ideas, which is far more common.
We can also see that most modern themes do not have subthemes yet. We believe this is because LEGO prefers to play it safe: it first creates multiple separate themes and then delves deeper into only the few that have met public expectations. This appears to have been its strategy from the very beginning and it seems to work well, so we do not expect much to change in this respect. There are just a few exceptions to this rule, and they form a very small part of the whole theme space.
The LEGO company seems to have made the right decisions throughout its history and continues to do so. The current marketing strategy appears well planned, and the growth of brick production is closely aligned with this plan, so we do not expect problems of unavailability or overproduction. The company is likely to keep growing, but we expect the expansion to slow down gradually until it stops. We suggest more focused research into themes that can be described as “liked” by the public, judging by their number of subcategories, and into why some old themes have failed to expand. This information may be crucial for developing future themes that are more likely to succeed.